AI prototypes for UK welfare system dropped as officials lament 'false starts'
Ministers have shut down or dropped at least half a dozen artificial intelligence prototypes intended for the welfare system, the Guardian has learned, in a sign of the headwinds facing Keir Starmer's effort to increase government efficiency. Pilots of AI technology to enhance staff training, improve the service in jobcentres, speed up disability benefit payments and modernise communication systems are not being taken forward, freedom of information (FoI) requests reveal. Officials have internally admitted that ensuring AI systems are "scalable, reliable [and] thoroughly tested" is a key challenge and say there have been many "frustrations and false starts". Not all trials would be expected to make it into regular use, but two of those now scrapped had been highlighted by the Department for Work and Pensions (DWP) in its latest annual report as examples of how it had "successfully tested multiple generative AI proofs of concept". A-cubed was intended to help staff steer jobseekers into work.
R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents
Johnson, Daniel D., Tarlow, Daniel, Walder, Christian
Large language models show impressive results at predicting structured text such as code, but also commonly introduce errors and hallucinations in their output. When used to assist software developers, these models may make mistakes that users must go back and fix, or worse, introduce subtle bugs that users may miss entirely. We propose Randomized Utility-driven Synthesis of Uncertain REgions (R-U-SURE), an approach for building uncertainty-aware suggestions based on a decision-theoretic model of goal-conditioned utility, using random samples from a generative model as a proxy for the unobserved possible intents of the end user. Our technique combines minimum-Bayes-risk decoding, dual decomposition, and decision diagrams in order to efficiently produce structured uncertainty summaries, given only sample access to an arbitrary generative model of code and an optional AST parser. We demonstrate R-U-SURE on three developer-assistance tasks, and show that it can be applied to different user interaction patterns without retraining the model and leads to more accurate uncertainty estimates than token-probability baselines. We also release our implementation as an open-source library at https://github.com/google-research/r_u_sure.
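The core idea in the abstract, treating model samples as stand-ins for possible user intents and choosing the suggestion with the highest average utility, can be illustrated with a brute-force minimum-Bayes-risk sketch. This is not the R-U-SURE algorithm or the API of the released library: it omits dual decomposition and decision diagrams entirely, and the names `utility`, `mbr_select`, and `flag_uncertain`, the token-overlap utility, and the support threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def utility(suggestion: str, intent: str) -> float:
    """Hypothetical utility: token-level similarity between a suggestion
    and one sampled 'intent' (assumed here; not the paper's utility)."""
    return SequenceMatcher(None, suggestion.split(), intent.split()).ratio()


def mbr_select(samples: list[str]) -> tuple[str, float]:
    """Brute-force minimum-Bayes-risk choice: return the sample whose
    average utility against all samples (the proxy intents) is highest."""
    if not samples:
        raise ValueError("need at least one sample")
    best, best_score = samples[0], float("-inf")
    for candidate in samples:
        score = sum(utility(candidate, s) for s in samples) / len(samples)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def flag_uncertain(
    suggestion: str, samples: list[str], threshold: float = 0.5
) -> list[tuple[str, bool]]:
    """Pair each token with True (confident) or False (uncertain), based on
    the fraction of samples that contain it in an aligned matching block."""
    tokens = suggestion.split()
    support = [0] * len(tokens)
    for sample in samples:
        matcher = SequenceMatcher(None, tokens, sample.split())
        for block in matcher.get_matching_blocks():
            for i in range(block.a, block.a + block.size):
                support[i] += 1
    return [(tok, n / len(samples) >= threshold) for tok, n in zip(tokens, support)]
```

For the three samples `return x + 1`, `return x + 1`, and `return x - 1`, this sketch selects `return x + 1` and, at a threshold of 0.9, marks only the `+` token as uncertain. The cost is quadratic in the number of samples, which is one reason a practical system would need a more efficient search than this exhaustive loop.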